AITopics | mid-level feature

mid-level feature

7486cef2522ee03547cfb970a404a874-Paper.pdf

Neural Information Processing Systems | Feb-9-2026, 09:25:06 GMT

generator, perturbation, transferability, (12 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland (0.04)
North America > Canada > Quebec > Montreal (0.04)

Industry: Information Technology > Security & Privacy (0.70)

Technology:

Information Technology > Security & Privacy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Learning Transferable Adversarial Perturbations

Neural Information Processing Systems | Dec-24-2025, 07:34:05 GMT

While effective, deep neural networks (DNNs) are vulnerable to adversarial attacks. In particular, recent work has shown that such attacks could be generated by another deep network, leading to significant speedups over optimization-based perturbations. However, the ability of such generative methods to generalize to different test-time situations has not been systematically studied. In this paper, we, therefore, investigate the transferability of generated perturbations when the conditions at inference time differ from the training ones in terms of the target architecture, target data, and target task. Specifically, we identify the mid-level features extracted by the intermediate layers of DNNs as common ground across different architectures, datasets, and tasks. This lets us introduce a loss function based on such mid-level features to learn an effective, transferable perturbation generator. Our experiments demonstrate that our approach outperforms the state-of-the-art universal and transferable attack strategies.

electronic proceedings, learning transferable adversarial perturbation, name change, (3 more...)

Neural Information Processing Systems

Industry:

Information Technology > Security & Privacy (0.61)
Government > Military (0.61)

Technology:

Information Technology > Security & Privacy (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)
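
The abstract of this entry describes a loss built on mid-level features of the target network. The following is a minimal sketch of that idea, not the authors' implementation: the abstract does not give the exact loss, so the negative squared distance, the L-infinity clipping, the toy network `toy_layers`, and all function names here are assumptions made for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu_layer(weights: Matrix, x: Vector) -> Vector:
    """Apply one fully connected layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def mid_level_features(layers: List[Matrix], x: Vector, depth: int) -> Vector:
    """Run the input through the first `depth` layers and return the activations."""
    for weights in layers[:depth]:
        x = relu_layer(weights, x)
    return x


def clip_perturbation(delta: Vector, eps: float) -> Vector:
    """Project the perturbation onto the L-infinity ball of radius eps."""
    return [max(-eps, min(eps, d)) for d in delta]


def feature_separation_loss(layers: List[Matrix], x: Vector, delta: Vector,
                            depth: int, eps: float) -> float:
    """Negative squared distance between clean and perturbed mid-level features.

    A generator trained to minimise this value pushes the two feature
    vectors apart (assumed objective, see the lead-in).
    """
    delta = clip_perturbation(delta, eps)
    x_adv = [xi + di for xi, di in zip(x, delta)]
    clean = mid_level_features(layers, x, depth)
    adv = mid_level_features(layers, x_adv, depth)
    return -sum((a - c) ** 2 for a, c in zip(adv, clean))


# Hypothetical two-layer surrogate network with hand-picked weights.
toy_layers = [
    [[1.0, 0.0], [0.0, 1.0]],  # layer 1, used as the "mid-level" layer here
    [[1.0, 1.0]],              # layer 2, ignored when depth=1
]
loss = feature_separation_loss(
    toy_layers, x=[1.0, 2.0], delta=[0.5, -0.5], depth=1, eps=0.1
)
```

Because the loss depends only on intermediate activations, it needs no class labels from the target task, which is one plausible reading of why such features could serve as common ground across architectures, datasets, and tasks.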

7486cef2522ee03547cfb970a404a874-Paper.pdf

Neural Information Processing Systems | Aug-15-2025, 05:17:17 GMT

generator, perturbation, transferability, (12 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland (0.04)
North America > Canada > Quebec > Montreal (0.04)

Industry: Information Technology > Security & Privacy (0.70)

Technology:

Information Technology > Security & Privacy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Are Inherently Interpretable Models More Robust? A Study In Music Emotion Recognition

Hoedt, Katharina, Flexer, Arthur, Widmer, Gerhard

arXiv.org Artificial Intelligence | Aug-7-2025

One of the desired key properties of deep learning models is the ability to generalise to unseen samples. When provided with new samples that are (perceptually) similar to one or more training samples, deep learning models are expected to produce correspondingly similar outputs. Models that succeed in predicting similar outputs for similar inputs are often called robust. Deep learning models, on the other hand, have been shown to be highly vulnerable to minor (adversarial) perturbations of the input, which manage to drastically change a model's output and simultaneously expose its reliance on spurious correlations. In this work, we investigate whether inherently interpretable deep models, i.e., deep models that were designed to focus more on meaningful and interpretable features, are more robust to irrelevant perturbations in the data, compared to their black-box counterparts. We test our hypothesis by comparing the robustness of an interpretable and a black-box music emotion recognition (MER) model when challenged with adversarial examples. Furthermore, we include an adversarially trained model, which is optimised to be more robust, in the comparison. Our results indicate that inherently more interpretable models can indeed be more robust than their black-box counterparts, and achieve similar levels of robustness as adversarially trained models, at lower computational cost.

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.5281/zenodo.15837204

2508.0378

Genre: Research Report > New Finding (0.67)

Industry:

Media > Music (0.69)
Leisure & Entertainment (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Learning Transferable Adversarial Perturbations

Neural Information Processing Systems | Oct-11-2024, 05:24:58 GMT

architecture, learning transferable adversarial perturbation, mid-level feature

Neural Information Processing Systems

Industry:

Information Technology > Security & Privacy (0.46)
Government > Military (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Improved Few-shot Segmentation by Redefinition of the Roles of Multi-level CNN Features

Wang, Zhijie, Suganuma, Masanori, Okatani, Takayuki

arXiv.org Artificial Intelligence | Sep-14-2021

This study is concerned with few-shot segmentation, i.e., segmenting the region of an unseen object class in a query image, given support image(s) of its instances. The current methods rely on the pretrained CNN features of the support and query images. Good performance depends on the proper fusion of their mid-level and high-level features; the former contain shape-oriented information, while the latter carry class-oriented information. Current state-of-the-art methods follow the approach of Tian et al., which gives the mid-level features the primary role and the high-level features the secondary role. In this paper, we reinterpret this widely employed approach by redefining the roles of the multi-level features; we swap the primary and secondary roles. Specifically, we view the current methods as improving the initial estimate generated from the high-level features by using the mid-level features. This reinterpretation suggests a new application of the current methods: to apply the same network multiple times to iteratively update the estimate of the object's region, starting from its initial estimate. Our experiments show that this method is effective and improves on the previous state of the art on COCO-20$^i$ in the 1-shot and 5-shot settings and on PASCAL-5$^i$ in the 1-shot setting.

mid-level feature, pfenet, segmentation, (13 more...)

arXiv.org Artificial Intelligence

2109.06432

Country: Asia > Japan > Honshū > Tōhoku (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
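
The abstract of this entry proposes applying the same network repeatedly to update an initial region estimate. The control flow can be sketched as below. This is only an illustration under stated assumptions: `toy_refine` is a hypothetical stand-in for the segmentation network, and the flattened probability mask is an assumed representation, neither of which comes from the paper.

```python
from typing import Callable, List

Mask = List[float]  # per-pixel foreground probability, flattened


def iterative_refinement(initial: Mask,
                         refine: Callable[[Mask], Mask],
                         steps: int) -> List[Mask]:
    """Apply the same refinement function `steps` times, keeping every estimate."""
    estimates = [initial]
    for _ in range(steps):
        estimates.append(refine(estimates[-1]))
    return estimates


def toy_refine(mask: Mask) -> Mask:
    """Hypothetical stand-in for the network: sharpen probabilities towards 0 or 1."""
    return [p * p / (p * p + (1.0 - p) ** 2) for p in mask]


# The initial estimate would come from high-level features; here it is made up.
history = iterative_refinement([0.6, 0.4, 0.5], toy_refine, steps=3)
```

Keeping every intermediate estimate in `history` makes it easy to inspect how the mask changes from one pass to the next.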

Towards Explainable Music Emotion Recognition: The Route via Mid-level Features

Chowdhury, Shreyan, Vall, Andreu, Haunschmid, Verena, Widmer, Gerhard

arXiv.org Machine Learning | Jul-8-2019

Emotional aspects play an important part in our interaction with music. However, modelling these aspects in MIR systems has been notoriously challenging, since emotion is an inherently abstract and subjective experience, which makes it difficult to quantify or predict in the first place, and to make sense of the predictions afterwards. In an attempt to create a model that can give a musically meaningful and intuitive explanation for its predictions, we propose a VGG-style deep neural network that learns to predict emotional characteristics of a musical piece together with (and based on) human-interpretable, mid-level perceptual features. We compare this to predicting emotion directly with an identical network that does not take into account the mid-level features and observe that the loss in predictive performance of going through the mid-level features is surprisingly low, on average. The design of our network allows us to visualize the effects of perceptual features on individual emotion predictions, and we argue that the small loss in performance in going through the mid-level features is justified by the gain in explainability of the predictions.

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Machine Learning

1907.03572

Country: Europe (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
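
The abstract of this entry says the network design makes the effect of each perceptual feature on an emotion prediction visible. One way this could work is sketched below, assuming the emotion rating is a linear read-out of the mid-level feature values, so that each feature's effect is simply weight times value. The linear read-out, the feature names, and all numbers are assumptions for illustration and are not taken from the paper.

```python
from typing import Dict, List


def predict_emotion(weights: List[float], bias: float, mid_level: List[float]) -> float:
    """Linear read-out of one emotion rating from mid-level feature values."""
    return bias + sum(w * f for w, f in zip(weights, mid_level))


def feature_effects(names: List[str], weights: List[float],
                    mid_level: List[float]) -> Dict[str, float]:
    """Contribution of each mid-level feature to the prediction (weight * value)."""
    return {n: w * f for n, w, f in zip(names, weights, mid_level)}


# Hypothetical feature names, weights, and predicted feature values.
names = ["melodiousness", "rhythmic_complexity", "dissonance"]
weights = [0.5, -0.25, -1.0]
bias = 0.1
mid_level = [0.8, 0.4, 0.1]

prediction = predict_emotion(weights, bias, mid_level)
effects = feature_effects(names, weights, mid_level)
```

Under this assumption the effects plus the bias add up to the prediction, so a bar chart of `effects` would be a complete explanation of an individual output.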

Sleep Stage Classification Based on Multi-level Feature Learning and Recurrent Neural Networks via Wearable Device

Zhang, Xin, Kou, Weixuan, Chang, Eric I-Chao, Gao, He, Fan, Yubo, Xu, Yan

arXiv.org Machine Learning | Nov-2-2017

This paper proposes a practical approach for automatic sleep stage classification based on a multilevel feature learning framework and a Recurrent Neural Network (RNN) classifier using heart rate and wrist actigraphy derived from a wearable device. The feature learning framework is designed to extract low- and mid-level features. Low-level features capture temporal and frequency domain properties, and mid-level features learn compositions and structural information of signals. Since sleep staging is a sequential problem with long-term dependencies, we take advantage of RNNs with Bidirectional Long Short-Term Memory (BLSTM) architectures for sequence data learning. To simulate the actual situation of daily sleep, experiments are conducted with a resting group in which sleep is recorded in resting state, and a comprehensive group in which both resting sleep and non-resting sleep are included. We evaluate the algorithm based on an eightfold cross validation to classify five sleep stages (W, N1, N2, N3, and REM). Various comparison experiments demonstrate the effectiveness of feature learning and BLSTM. We further explore the influence of depth and width of RNNs on performance. Our method is specially proposed for wearable devices and is expected to be applicable for long-term sleep monitoring at home. Without using too much prior domain knowledge, our method has the potential to generalize sleep disorder detection. Index Terms: Heart rate, Long Short-Term Memory, Recurrent neural networks, Sleep stage classification, Wearable device.
Xin Zhang, Weixuan Kou, Yubo Fan and Yan Xu are with the State Key Laboratory of Software Development Environment and the Key Laboratory of Biomechanics and Mechanobiology of Ministry of Education and Research Institute of Beihang University in Shenzhen and Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing 100191, China (email: xinzhang0376@gmail.com). Eric I-Chao Chang and Yan Xu are with Microsoft Research, Beijing 100080, China (email: echang@microsoft.com; xuyan04@gmail.com).

artificial intelligence, classification, machine learning, (19 more...)

arXiv.org Machine Learning

1711.00629

Country:

North America (1.00)
Asia > China > Beijing > Beijing (0.65)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Sleep (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
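
The abstract of this entry reports an eightfold cross validation. A minimal sketch of how such a split can be produced is shown below. The abstract does not say whether folds are formed per subject or per recording, or whether items are shuffled, so the contiguous, unshuffled partition and the helper name `k_fold_splits` are assumptions.

```python
from typing import List, Tuple


def k_fold_splits(n_items: int, k: int = 8) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, test_indices) pairs for k contiguous folds."""
    if not 1 < k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    base, extra = divmod(n_items, k)
    splits = []
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional item so sizes differ by at most 1.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        splits.append((train, test))
        start += size
    return splits


# Hypothetical example: 20 recordings split into 8 folds.
splits = k_fold_splits(n_items=20, k=8)
fold_sizes = [len(test) for _, test in splits]
```

Each item appears in exactly one test fold, so every recording is evaluated once by a model that did not see it during training.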

Collaborative Receptive Field Learning

Kong, Shu, Jiang, Zhuolin, Yang, Qiang

arXiv.org Machine Learning | Feb-2-2014

The challenge of object categorization in images is largely due to arbitrary translations and scales of the foreground objects. To attack this difficulty, we propose a new approach called collaborative receptive field learning to extract specific receptive fields (RFs) or regions from multiple images, where the selected RFs are supposed to focus on the foreground objects of a common category. To this end, we solve the problem by maximizing a submodular function over a similarity graph constructed from a pool of RF candidates. However, measuring the pairwise distance of RFs for building the similarity graph is a nontrivial problem. Hence, we introduce a similarity metric called pyramid-error distance (PED) to measure their pairwise distances by summing up pyramid-like matching errors over a set of low-level features. In addition, consistent with the proposed PED, we construct a simple nonparametric classifier for classification. Experimental results show that our method effectively discovers the foreground objects in images, and improves classification performance.

artificial intelligence, machine learning, submodular function, (15 more...)

arXiv.org Machine Learning

1402.017

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
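
The abstract of this entry selects receptive fields by maximizing a submodular function over a similarity graph. The sketch below shows the standard greedy procedure for that kind of problem. The abstract does not name the objective, so the facility-location function used here, the cardinality budget, and the toy similarity matrix are assumptions, not the paper's formulation.

```python
from typing import List, Sequence


def facility_location(similarity: Sequence[Sequence[float]], selected: List[int]) -> float:
    """Sum over all candidates of their best similarity to any selected candidate."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in similarity)


def greedy_select(similarity: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Greedily add the candidate with the largest marginal gain until the budget is met.

    Assumes non-negative similarities, so every marginal gain is >= 0.
    """
    selected: List[int] = []
    for _ in range(min(budget, len(similarity))):
        current = facility_location(similarity, selected)
        best_gain, best_idx = -1.0, -1
        for idx in range(len(similarity)):
            if idx in selected:
                continue
            gain = facility_location(similarity, selected + [idx]) - current
            if gain > best_gain:
                best_gain, best_idx = gain, idx
        selected.append(best_idx)
    return selected


# Hypothetical similarity graph: candidate 0 is isolated, candidates 1-3 form a
# cluster with candidate 2 at its centre.
sim = [
    [1.0, 0.1, 0.1, 0.1],
    [0.1, 1.0, 0.8, 0.3],
    [0.1, 0.8, 1.0, 0.8],
    [0.1, 0.3, 0.8, 1.0],
]
chosen = greedy_select(sim, budget=2)
```

For monotone submodular objectives under a cardinality constraint, this greedy rule is known to reach at least a (1 - 1/e) fraction of the optimal value, which is why it is a common default for selection problems of this shape.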

Recognizing Continuous Social Engagement Level in Dyadic Conversation by Using Turn-taking and Speech Emotion Patterns

Hsiao, Joey Chiao-yin (National Taiwan University) | Jih, Wan-rong (National Taiwan University) | Hsu, Jane Yung-jen (National Taiwan University)

AAAI Conferences | Jul-21-2012

Recognizing social interest plays an important role in aiding human-computer interaction and human collaborative work. The recognition of social interest could be of great help in determining the smoothness of an interaction, which could be an indicator of group work performance and relationship. According to socio-psychological theories, social engagement is the observable form of inner social interest, and is represented as patterns of turn-taking and speech emotion during a face-to-face conversation. With these two kinds of features, a multi-layer learning structure is proposed to model the continuous trend of engagement. The level of engagement is classified into two levels, “high” and “low”, according to human-annotated scores. In assessing two-level engagement, our model reaches a highest accuracy of 79.1%.

engagement, participant, recognition, (14 more...)

AAAI Conferences

Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts (0.05)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Communications (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)
